Face Attribute Prediction Using Off-the-Shelf CNN Features
Predicting attributes from face images in the wild is a challenging computer
vision problem. To automatically describe face attributes from face-containing
images, one traditionally needs to cascade three technical blocks (face
localization, facial descriptor construction, and attribute classification)
in a pipeline. As a typical classification problem, face attribute prediction
has been addressed using deep learning. Current state-of-the-art performance
was achieved by using two cascaded Convolutional Neural Networks (CNNs), which
were specifically trained to learn face localization and attribute description.
In this paper, we experiment with an alternative way of employing the power of
deep representations from CNNs. Combined with conventional face localization
techniques, we use off-the-shelf architectures trained for face recognition to
build facial descriptors. Recognizing that describable face attributes are
diverse, we construct our face descriptors from different levels of the CNNs
for different attributes to best facilitate face attribute prediction.
Experiments on two large datasets, LFWA and CelebA, show that our approach is
entirely comparable to the state-of-the-art. Our findings not only demonstrate
an efficient face attribute prediction approach, but also raise an important
question: how to leverage the power of off-the-shelf CNN representations for
novel tasks.
Comment: In proceedings of the 2016 International Conference on Biometrics (ICB).
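As a rough illustration of the recipe this abstract describes, the sketch below extracts descriptors from two depths of an off-the-shelf CNN and trains per-attribute linear classifiers on them. It assumes a generic torchvision backbone as a stand-in for a face-recognition network; the layer choices, hooks, and classifier settings are illustrative, not the authors' configuration.

```python
# Hypothetical sketch: per-attribute linear classifiers on features taken from
# different depths of an off-the-shelf CNN. The paper uses a face-recognition
# network plus conventional face localization; a generic torchvision backbone
# stands in here, so layer names and settings are illustrative only.
import torch
import torchvision.models as models
from sklearn.svm import LinearSVC

backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2).eval()

# Capture activations at two depths via forward hooks.
feats = {}
backbone.layer3.register_forward_hook(lambda m, i, o: feats.update(mid=o))
backbone.avgpool.register_forward_hook(lambda m, i, o: feats.update(top=o))

def describe(face_batch):
    """face_batch: (N, 3, 224, 224) tensor of localized, aligned face crops."""
    with torch.no_grad():
        backbone(face_batch)
    mid = feats["mid"].mean(dim=(2, 3))   # spatially pooled mid-level features
    top = feats["top"].flatten(1)         # high-level descriptor
    return {"mid": mid.numpy(), "top": top.numpy()}

def train_attribute_classifiers(face_batch, labels, level_per_attr):
    """labels: (N, A) binary attribute matrix; level_per_attr[a]: 'mid' or 'top'."""
    desc = describe(face_batch)
    return [LinearSVC(C=1.0).fit(desc[level_per_attr[a]], labels[:, a])
            for a in range(labels.shape[1])]
```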
Leveraging Mid-Level Deep Representations For Predicting Face Attributes in the Wild
Predicting facial attributes from faces in the wild is very challenging due
to pose and lighting variations in the real world. The key to this problem is
to build proper feature representations to cope with these unfavourable
conditions. Given the success of Convolutional Neural Network (CNN) in image
classification, the high-level CNN feature, as an intuitive and reasonable
choice, has been widely utilized for this problem. In this paper, however, we
consider the mid-level CNN features as an alternative to the high-level ones
for attribute prediction. This is based on the observation that face attributes
are different: some of them are locally oriented while others are globally
defined. Our investigations reveal that the mid-level deep representations
outperform the (fine-tuned) high-level abstractions in prediction accuracy. We
empirically demonstrate that the mid-level representations achieve
state-of-the-art prediction performance on the CelebA and LFWA datasets.
Our investigations also show that by utilizing the mid-level representations
one can employ a single deep network to achieve both face recognition and
attribute prediction.
Comment: In proceedings of the 2016 International Conference on Image Processing (ICIP).
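A minimal sketch of the "single network, both tasks" idea mentioned above: identity is predicted from the final embedding while attributes are read off a pooled mid-level feature map. The backbone (ResNet-18), the layer split, and the head sizes are assumptions for illustration, not the architecture used in the paper.

```python
# Hypothetical sketch: one backbone serving face recognition (high-level
# embedding) and attribute prediction (mid-level features). Illustrative only.
import torch
import torch.nn as nn
import torchvision.models as models

class FaceNetWithAttributes(nn.Module):
    def __init__(self, num_identities=1000, num_attributes=40):
        super().__init__()
        base = models.resnet18(weights=None)
        self.stem = nn.Sequential(base.conv1, base.bn1, base.relu, base.maxpool,
                                  base.layer1, base.layer2)             # mid-level features
        self.top = nn.Sequential(base.layer3, base.layer4, base.avgpool)  # high-level features
        self.id_head = nn.Linear(512, num_identities)    # face recognition head
        self.attr_head = nn.Linear(128, num_attributes)  # attribute head on mid-level features

    def forward(self, x):
        mid = self.stem(x)                                # (N, 128, H, W)
        top = self.top(mid).flatten(1)                    # (N, 512)
        attr_logits = self.attr_head(mid.mean(dim=(2, 3)))  # attributes from mid-level map
        id_logits = self.id_head(top)                       # identity from high-level embedding
        return id_logits, attr_logits

model = FaceNetWithAttributes()
id_logits, attr_logits = model(torch.randn(2, 3, 224, 224))
```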
The realism in the poetry of John Masefield
This item was digitized by the Internet Archive. Thesis (M.A.)--Boston University. https://archive.org/details/therealisminpoet00sul
CNN Features off-the-shelf: an Astounding Baseline for Recognition
Recent results indicate that the generic descriptors extracted from the
convolutional neural networks are very powerful. This paper adds to the
mounting evidence that this is indeed the case. We report on a series of
experiments conducted for different recognition tasks using the publicly
available code and model of the OverFeat network, which was trained to perform
object classification on ILSVRC13. We use features extracted from the OverFeat
network as a generic image representation to tackle a diverse range of
recognition tasks: object image classification, scene recognition, fine-grained
recognition, attribute detection, and image retrieval, applied to a diverse set
of datasets. We selected these tasks and datasets as they gradually move
further away from the original task and data the OverFeat network was trained
to solve. Astonishingly, we report consistently superior results compared to
the highly tuned state-of-the-art systems in all the visual classification
tasks on various datasets. For instance retrieval, it consistently outperforms
low-memory-footprint methods, except on the sculptures dataset. The results are
achieved using a linear SVM classifier (or distance in case of retrieval)
applied to a feature representation of size 4096 extracted from a layer in the
net. The representations are further modified using simple augmentation
techniques, e.g. jittering. The results strongly suggest that features obtained
from deep learning with convolutional nets should be the primary candidate in
most visual recognition tasks.
Comment: version 3 revisions: 1) Added results using feature processing and data augmentation; 2) Referring to most recent efforts of using CNNs for different visual recognition tasks; 3) Updated text/captions.
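The pipeline in this abstract reduces to "fixed deep feature + linear SVM (or a simple distance for retrieval)". The sketch below shows that skeleton only; the feature extractor is stubbed out (the paper uses a 4096-d OverFeat activation), and the data, regularisation, and normalisation choices are illustrative assumptions.

```python
# Hypothetical sketch of the off-the-shelf-feature baseline: a fixed deep
# descriptor, a linear SVM for classification, and plain Euclidean distance
# for retrieval. The extractor is a stub; the paper uses OverFeat features.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import normalize

def extract_features(images):
    # Stand-in for a 4096-d activation from one layer of a pretrained CNN
    # (optionally averaged over jittered crops as simple augmentation).
    return np.random.randn(len(images), 4096).astype(np.float32)

# Classification: linear SVM on L2-normalised deep features.
train_images = [None] * 100                      # placeholder image list
train_labels = np.random.randint(0, 10, size=100)
clf = LinearSVC(C=1.0).fit(normalize(extract_features(train_images)), train_labels)

# Retrieval: rank database images by distance to the query feature.
db = normalize(extract_features([None] * 500))
query = normalize(extract_features([None]))
ranking = np.argsort(np.linalg.norm(db - query, axis=1))  # nearest first
```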
Predicting success in tenth grade geometry.
Thesis (M.A.)--Boston University
Contrastive pretraining for semantic segmentation is robust to noisy positive pairs
Domain-specific variants of contrastive learning can construct positive pairs
from two distinct images, as opposed to augmenting the same image twice. Unlike
in traditional contrastive methods, this can result in positive pairs not
matching perfectly. Similar to false negative pairs, this could impede model
performance. Surprisingly, we find that downstream semantic segmentation is
either robust to the noisy pairs or even benefits from them. The experiments
are conducted on the remote sensing dataset xBD, and a synthetic segmentation
dataset, on which we have full control over the noise parameters. As a result,
practitioners should be able to use such domain-specific contrastive methods
without having to filter their positive pairs beforehand.
Comment: 8 pages, 8 figures
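To make the setup concrete, here is a generic InfoNCE-style contrastive loss in which each positive pair comes from two distinct images (e.g. two views of the same location) rather than two augmentations of one image. The encoder, temperature, and pairing scheme actually used in the paper are not specified here and may differ.

```python
# Hypothetical sketch of a contrastive (InfoNCE-style) loss whose positive
# pairs are built from two *different* images of the same underlying content.
import torch
import torch.nn.functional as F

def info_nce(z_a, z_b, temperature=0.1):
    """z_a[i] and z_b[i] are embeddings of two distinct images forming a positive pair."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature      # (N, N) pairwise similarities
    targets = torch.arange(z_a.size(0))       # the matching index is the positive
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```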
Persistent Evidence of Local Image Properties in Generic ConvNets
Supervised training of a convolutional network for object classification
should make explicit any information related to the class of objects and
disregard any auxiliary information associated with the capture of the image or
the variation within the object class. Does this happen in practice? Although
this seems to pertain to the very final layers in the network, if we look at
earlier layers we find that this is not the case. Surprisingly, strong spatial
information is implicit. This paper addresses this issue, in particular by
exploiting the image representation at the first fully connected layer, i.e.
the global image descriptor that has recently been shown to be most effective
across a range of visual recognition tasks. We empirically demonstrate evidence
for this finding in the context of four different tasks: 2D landmark detection,
2D object keypoint prediction, estimation of the RGB values of the input image,
and recovery of the semantic label of each pixel. We base our investigation on
a simple ridge regression framework used commonly across these tasks, and show
results that all support our insight. Such spatial information can be used to
compute landmark correspondences to good accuracy, and could potentially also
be useful for improving the training of convolutional nets for classification
purposes.
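As a concrete illustration of the probing framework described above, the sketch below fits a ridge regression from an fc-layer descriptor to 2D landmark coordinates. The features and targets are random stand-ins, and the feature dimension, regularisation strength, and split are assumptions, not the paper's settings.

```python
# Hypothetical sketch: ridge regression probe from a first-fully-connected-layer
# descriptor to a spatial target (2D landmark coordinates). Stubbed data only.
import numpy as np
from sklearn.linear_model import Ridge

n_images, feat_dim, n_landmarks = 1000, 4096, 5
fc_features = np.random.randn(n_images, feat_dim)       # stand-in fc-layer descriptors
landmarks = np.random.rand(n_images, n_landmarks * 2)   # (x, y) per landmark, flattened

probe = Ridge(alpha=1.0).fit(fc_features[:800], landmarks[:800])
pred = probe.predict(fc_features[800:])                 # recover spatial info from the descriptor
```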